Results 1 - 20 of 128,776
1.
Methods Mol Biol ; 2787: 107-122, 2024.
Article in English | MEDLINE | ID: mdl-38656485

ABSTRACT

Genetic diversity refers to the variety of genetic traits within a population or a species. It is an essential aspect of both plant ecology and plant breeding because it contributes to the adaptability, survival, and resilience of populations in changing environments. This chapter outlines a pipeline for estimating genetic diversity statistics from reduced representation or whole genome sequencing data. The pipeline involves obtaining DNA sequence reads, mapping the corresponding reads to a reference genome, calling variants from the alignments, and generating an unbiased estimation of nucleotide diversity and divergence between populations. The pipeline is suitable for single-end Illumina reads and can be adjusted for paired-end reads. The resulting pipeline provides a comprehensive approach for aligning and analyzing sequencing data to estimate genetic diversity.


Subjects
Genetic Variation , Genome, Plant , Plants , Plants/genetics , Software , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Computational Biology/methods , Genomics/methods
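Illustrative note for entry 1: the abstract mentions an unbiased estimate of nucleotide diversity and of divergence between populations, but gives no formulas. The sketch below shows the textbook per-site estimators for biallelic sites, not the chapter's actual pipeline. It assumes no missing genotypes (the same number of sampled chromosomes at every site), and the function and parameter names are hypothetical. One common source of bias is dividing by the number of variant sites only, so the denominator here is the number of all callable sites, including invariant ones.

```python
from typing import Sequence


def nucleotide_diversity(alt_counts: Sequence[int], n_chrom: int, n_sites: int) -> float:
    """Per-site nucleotide diversity (pi) for biallelic sites.

    alt_counts: alternate-allele count at each variant site
    n_chrom:    number of sampled chromosomes (assumed equal at all sites)
    n_sites:    number of callable sites, variant and invariant
    """
    if n_chrom < 2:
        raise ValueError("need at least two chromosomes")
    n_pairs = n_chrom * (n_chrom - 1) / 2
    total = 0.0
    for k in alt_counts:
        # k * (n - k) of the n_pairs chromosome pairs differ at this site
        total += k * (n_chrom - k) / n_pairs
    return total / n_sites


def divergence_dxy(freqs_pop1: Sequence[float], freqs_pop2: Sequence[float], n_sites: int) -> float:
    """Per-site absolute divergence (dxy) from alternate-allele frequencies."""
    total = sum(p1 * (1 - p2) + p2 * (1 - p1) for p1, p2 in zip(freqs_pop1, freqs_pop2))
    return total / n_sites
```

Invariant sites add nothing to the numerator, so only variant sites are passed in, while `n_sites` carries the full count.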
2.
Methods Mol Biol ; 2787: 169-181, 2024.
Article in English | MEDLINE | ID: mdl-38656489

ABSTRACT

Genetic maps are an excellent tool for analyzing important traits whose development results from the combined expression of several genes, because they enable the genomic localization of the factors that determine those traits. Such traits, characterized by a normal distribution of values, are referred to as quantitative or polygenic. The analysis of their genetic background using a chromosome map is called quantitative trait locus (QTL) mapping. QTL analysis is a statistical method for determining the genetic association of phenotypic data (trait measurements) with genotypic data (DNA markers assigned to linkage groups). Numerous tools have been developed for QTL mapping. This chapter introduces Windows QTL Cartographer with the composite interval mapping (CIM) method, which estimates QTL positions by combining interval mapping with multiple regression. The genotypic and phenotypic data used in the example QTL mapping procedure were obtained from a recombinant inbred line (RIL) population of rye. Plant height, assessed in three seasons, was the example trait under study.


Subjects
Chromosome Mapping , Phenotype , Quantitative Trait Loci , Chromosome Mapping/methods , Genotype , Genetic Linkage , Software , Inbreeding , Chromosomes, Plant/genetics
3.
Methods Mol Biol ; 2787: 153-168, 2024.
Article in English | MEDLINE | ID: mdl-38656488

ABSTRACT

Genetic mapping is the determination of the position of, and relative genetic distance between, genes or molecular markers on the chromosomes of a particular species. The construction of genetic maps uses genotyping data from a mapping population. Among the different mapping populations used, two are relatively common: F2 populations and recombinant inbred lines (RILs), both obtained by the controlled crossing of genetically diverse parental forms (e.g., inbred lines). Dihaploid (DH) populations are also often used in plants, but obtaining DHs in various crops, including rye, is very difficult or even impossible. Any molecular marker system can be used for genotyping. Linkage analysis uses polymorphic markers that differentiate the parental forms and whose segregation in the mapping population is consistent with the appropriate single-gene model. A genetic map is a rich source of information on a species and can be an excellent tool for analyzing important quantitative traits (QT). This chapter presents the procedure for constructing a genetic map with two different algorithms using the JoinMap5.0 program. First, the Materials section briefly describes the mapping program and shows how to obtain a mapping population and prepare data for mapping. The Methods section then describes the protocol for the mapping procedure itself.


Subjects
Chromosome Mapping , Genetic Linkage , Quantitative Trait Loci , Chromosome Mapping/methods , Algorithms , Crosses, Genetic , Genotype , Genetic Markers , Software , Chromosomes, Plant/genetics
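Illustrative note for entry 3: the abstract states that markers used for linkage analysis should segregate in line with the appropriate single-gene model. A generic way to check this is a chi-square goodness-of-fit test against the expected ratio, for example 1:2:1 for a codominant marker in an F2 population or 1:1 in RILs. The sketch below is a plain-Python version of that generic test, not the JoinMap procedure. The function names are hypothetical, and the 5% critical values are hard-coded for 1 and 2 degrees of freedom only.

```python
from typing import Sequence

# Chi-square critical values at alpha = 0.05, keyed by degrees of freedom
CRITICAL_0_05 = {1: 3.841, 2: 5.991}


def segregation_chi2(observed: Sequence[int], expected_ratio: Sequence[float]) -> float:
    """Chi-square statistic for observed genotype-class counts vs. an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    chi2 = 0.0
    for obs, ratio in zip(observed, expected_ratio):
        expected = total * ratio / ratio_sum
        chi2 += (obs - expected) ** 2 / expected
    return chi2


def fits_model(observed: Sequence[int], expected_ratio: Sequence[float]) -> bool:
    """True if segregation does not deviate significantly (5% level) from the model."""
    df = len(observed) - 1
    return segregation_chi2(observed, expected_ratio) <= CRITICAL_0_05[df]


# Example: codominant marker in a hypothetical F2 population of 100 plants
f2_ok = fits_model([24, 52, 24], [1, 2, 1])
# Example: strongly distorted marker in a hypothetical RIL population
ril_ok = fits_model([90, 10], [1, 1])
```

Markers that fail such a test show segregation distortion and are usually inspected before being placed on a map.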
4.
Methods Mol Biol ; 2787: 225-243, 2024.
Article in English | MEDLINE | ID: mdl-38656493

ABSTRACT

Coffee, an important agricultural product for tropical producing countries, is facing challenges due to climate change, including periods of drought, irregular rain distribution, and high temperatures. These changes result in plant water stress, leading to significant losses in coffee productivity and quality. Understanding the processes that affect coffee flowering is crucial for improving productivity and quality. In this chapter, we describe a protocol for transcriptome analysis using software available online, mainly in the Galaxy Platform, and RNA-Seq data from flowers collected from different parts of the coffee tree. The methods presented provide a comprehensive protocol for transcriptome analysis of differentially expressed genes in flowers of the coffee plant. This knowledge can be used in coffee genetic improvement programs, particularly in the selection of cultivars that are tolerant to water deficit.


Subjects
Coffea , Flowers , Gene Expression Profiling , Gene Expression Regulation, Plant , Transcriptome , Flowers/genetics , Coffea/genetics , Gene Expression Profiling/methods , Transcriptome/genetics , Software , Computational Biology/methods , RNA-Seq/methods
5.
Methods Mol Biol ; 2787: 265-279, 2024.
Article in English | MEDLINE | ID: mdl-38656496

ABSTRACT

Polyacrylamide gel electrophoresis (PAGE) is a widely used technique for separating proteins from complex plant samples. Prior to the analysis, proteins must be extracted from plant tissues, which are more complex than other types of biological material. Different protocols are applied depending on the protein source, such as seeds, pollen, leaves, roots, or flowers. Total protein amounts must also be determined before conducting gel electrophoresis. The most common methodologies are PAGE under native or denaturing conditions. Both procedures are subsequently used for protein identification and characterization via mass spectrometry. Additionally, various staining procedures are available to visualize protein bands in the gel, facilitating software-based digital evaluation of the gel through image acquisition.


Subjects
Electrophoresis, Polyacrylamide Gel , Plant Proteins , Plants , Electrophoresis, Polyacrylamide Gel/methods , Plant Proteins/analysis , Plant Proteins/isolation & purification , Plants/chemistry , Proteomics/methods , Software , Staining and Labeling/methods , Mass Spectrometry/methods
6.
Methods Mol Biol ; 2787: 333-353, 2024.
Article in English | MEDLINE | ID: mdl-38656501

ABSTRACT

X-ray crystallography is a robust and widely used technique that facilitates the three-dimensional structure determination of proteins at an atomic scale. This methodology entails the growth of protein crystals under controlled conditions, followed by their exposure to X-ray beams and the subsequent analysis of the resulting diffraction patterns with computational tools to determine the three-dimensional architecture of the protein. However, achieving high-resolution structures through X-ray crystallography can be quite challenging due to complexities associated with protein purity, crystallization efficiency, and crystal quality. In this chapter, we provide a detailed overview of the gene-to-structure determination pipeline used in X-ray crystallography, a crucial tool for understanding protein structures. The chapter covers the steps in protein crystallization, along with the processes of data collection, processing, structure determination, and refinement. The most commonly faced challenges throughout this procedure are also addressed. Finally, the importance of standardized protocols for reproducibility and accuracy is emphasized, as they are crucial for advancing the understanding of protein structure and function.


Subjects
Crystallization , Protein Conformation , Proteins , Crystallography, X-Ray/methods , Proteins/chemistry , Crystallization/methods , Models, Molecular , Software
7.
Methods Mol Biol ; 2788: 157-169, 2024.
Article in English | MEDLINE | ID: mdl-38656513

ABSTRACT

This chapter presents a comprehensive approach to predict novel miRNAs encoded by plant viruses and identify their target plant genes, through integration of various ab initio computational approaches. The predictive process begins with the analysis of plant viral sequences using the VMir Analyzer software. VMir Viewer software is then used to extract primary hairpins from these sequences. To distinguish real miRNA precursors from pseudo miRNA precursors, the MiPred web-based software is employed. Verified real pre-miRNA sequences with a minimum free energy below -20 kcal/mol are further analyzed using the RNAshapes software. Validation of predictions involves comparing them with available Expressed Sequence Tags (ESTs) from the relevant plant using BlastN. Short sequences with lengths ranging from 19 to 25 nucleotides and exhibiting fewer than 5 mismatches are prioritized for miRNA prediction. The precise locations of these short sequences within the pre-miRNA structures generated by RNAshapes are identified, with a focus on those situated on the 5' and 3' arms of the structures, which indicate potential miRNAs. Sequences within the arms of pre-miRNA structures are used to predict target sites within the ESTs of the specific plant, facilitated by psRNA Target software, revealing genes with potential regulatory roles in the plant. To confirm the outcome of target prediction, results are individually submitted to the RNAhybrid web-based software. For practical demonstration, this approach is applied to analyze African cassava mosaic virus (ACMV) and East African cassava mosaic virus-Uganda (EACMV-UG) viruses, as well as the ESTs of Jatropha and cassava.


Subjects
Computational Biology , MicroRNAs , Plant Viruses , RNA, Viral , Software , MicroRNAs/genetics , Plant Viruses/genetics , Computational Biology/methods , RNA, Viral/genetics , Genes, Plant , Nucleic Acid Conformation , Plants/virology , Plants/genetics , Expressed Sequence Tags
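Illustrative note for entry 7: the abstract gives three numeric filters: a precursor minimum free energy below -20 kcal/mol, a candidate length of 19 to 25 nucleotides, and fewer than 5 mismatches against the plant ESTs. The sketch below only encodes those thresholds as predicates. It does not call VMir, MiPred, RNAshapes, or BlastN, and the `Candidate` record and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one short sequence aligned to a plant EST."""
    sequence: str
    mismatches: int


def passes_precursor_filter(mfe_kcal_per_mol: float, threshold: float = -20.0) -> bool:
    # Keep pre-miRNA hairpins whose minimum free energy is below the threshold
    return mfe_kcal_per_mol < threshold


def passes_candidate_filter(c: Candidate, min_len: int = 19, max_len: int = 25,
                            max_mismatches: int = 4) -> bool:
    # "Fewer than 5 mismatches" is expressed as at most 4
    return min_len <= len(c.sequence) <= max_len and c.mismatches <= max_mismatches


def select_candidates(candidates: list[Candidate]) -> list[Candidate]:
    """Return only the candidates that satisfy both length and mismatch limits."""
    return [c for c in candidates if passes_candidate_filter(c)]
```

Keeping the thresholds as default arguments makes it easy to see how sensitive the candidate list is to each cut-off.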
8.
Methods Mol Biol ; 2788: 139-155, 2024.
Article in English | MEDLINE | ID: mdl-38656512

ABSTRACT

This computational protocol describes how to use pyPGCF, a Python software package that runs in the Linux environment, to analyze bacterial genomes and perform: (i) phylogenomic analysis, (ii) species demarcation, (iii) identification of the core proteins of a bacterial genus and its individual species, (iv) identification of species-specific fingerprint proteins that are found in all strains of a species and, at the same time, are absent from all other species of the genus, (v) functional annotation of the core and fingerprint proteins with eggNOG, and (vi) identification of secondary metabolite biosynthetic gene clusters (smBGCs) with antiSMASH. This software has already been used to analyze bacterial genera and species that are important for plants (e.g., Pseudomonas, Bacillus, Streptomyces). In addition, we provide a test dataset and example commands showing how to analyze 165 genomes from 55 species of the genus Bacillus. The main advantages of pyPGCF are that: (i) it uses adjustable orthology cut-offs, (ii) it identifies species-specific fingerprints, and (iii) its computational cost scales linearly with the number of genomes being analyzed. Therefore, pyPGCF is able to deal with a very large number of bacterial genomes, in reasonable timescales, using widely available levels of computing power.


Subjects
Genome, Bacterial , Phylogeny , Plants , Software , Plants/genetics , Plants/microbiology , Bacterial Proteins/genetics , Genomics/methods , Computational Biology/methods , Bacteria/genetics , Bacteria/classification , Multigene Family , Species Specificity
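Illustrative note for entry 8: the abstract defines core proteins (shared by all strains of a species or genus) and fingerprint proteins (found in all strains of one species and absent from every other species of the genus). Both definitions reduce to set operations once each strain is represented as a set of ortholog-group identifiers. The sketch below shows those set operations only. It is not pyPGCF code or its command-line interface, and the data layout and names are assumptions.

```python
from typing import Dict, Set

# Hypothetical layout: species -> strain -> set of ortholog-group IDs
Genus = Dict[str, Dict[str, Set[str]]]


def species_core(strains: Dict[str, Set[str]]) -> Set[str]:
    """Groups present in every strain of one species."""
    return set.intersection(*strains.values())


def genus_core(genus: Genus) -> Set[str]:
    """Groups present in every strain of every species."""
    return set.intersection(*(species_core(s) for s in genus.values()))


def fingerprints(genus: Genus, species: str) -> Set[str]:
    """Groups in all strains of `species` and in no strain of any other species."""
    others: Set[str] = set()
    for name, strains in genus.items():
        if name != species:
            others.update(*strains.values())
    return species_core(genus[species]) - others


toy_genus: Genus = {
    "sp_A": {"A1": {"g1", "g2", "g3"}, "A2": {"g1", "g2", "g3", "g4"}},
    "sp_B": {"B1": {"g1", "g4", "g5"}, "B2": {"g1", "g5"}},
}
```

In the toy data, `g4` occurs in one strain of each species, so it is neither core nor a fingerprint for either species.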
9.
BMC Bioinformatics ; 25(1): 163, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38664637

ABSTRACT

BACKGROUND: Identifying orthologs continues to be an early and imperative step in genome analysis but remains a challenging problem. While synteny (conservation of gene order) has previously been used independently and in combination with other methods to identify orthologs, applying synteny in ortholog identification has yet to be automated in a user-friendly manner. This desire for automation and ease of use led us to develop OrthoRefine, a standalone program that uses synteny to refine ortholog identification. RESULTS: We developed OrthoRefine to improve the detection of orthologous genes by implementing a look-around window approach to detect synteny. We tested OrthoRefine in tandem with OrthoFinder, one of the most widely used software tools for identifying orthologs in recent years. We evaluated the improvements provided by OrthoRefine on several bacterial datasets and one eukaryotic dataset. OrthoRefine efficiently eliminates paralogs from orthologous groups detected by OrthoFinder. Using synteny increased specificity and functional ortholog identification; additionally, analysis of BLAST e-value, phylogenetics, and operon occurrence further supported using synteny for ortholog identification. A comparison of several window sizes suggested that smaller window sizes (eight genes) were generally the most suitable for identifying orthologs via synteny. However, larger windows (30 genes) performed better in datasets containing less closely related genomes. A typical run of OrthoRefine with ~10 bacterial genomes can be completed in a few minutes on a regular desktop PC. CONCLUSION: OrthoRefine is a simple-to-use, standalone tool that automates the application of synteny to improve ortholog detection. OrthoRefine is particularly efficient in eliminating paralogs from orthologous groups delineated by standard methods.


Subjects
Software , Synteny , Algorithms , Databases, Genetic , Genomics/methods
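Illustrative note for entry 9: the abstract describes a look-around window that uses conserved gene order to separate orthologs from paralogs, but it does not give the scoring rule. The sketch below is one simple way such a window could work, not OrthoRefine's actual algorithm: for a candidate gene pair, it compares the ortholog groups of the neighboring genes in both genomes and reports the shared fraction. The function names, the meaning of `window` (total number of neighbors, half on each side), and the toy data are all assumptions.

```python
from typing import Dict, List


def neighbors(genome: List[str], index: int, window: int) -> List[str]:
    """Genes within window // 2 positions on each side of `index`, excluding the gene itself."""
    half = window // 2
    lo = max(0, index - half)
    hi = min(len(genome), index + half + 1)
    return [g for i, g in enumerate(genome[lo:hi], start=lo) if i != index]


def synteny_support(genome_a: List[str], genome_b: List[str], gene_a: str, gene_b: str,
                    group_of: Dict[str, str], window: int = 8) -> float:
    """Fraction of ortholog groups around gene_a that also occur around gene_b."""
    near_a = neighbors(genome_a, genome_a.index(gene_a), window)
    near_b = neighbors(genome_b, genome_b.index(gene_b), window)
    groups_a = {group_of[g] for g in near_a if g in group_of}
    groups_b = {group_of[g] for g in near_b if g in group_of}
    if not groups_a:
        return 0.0
    return len(groups_a & groups_b) / len(groups_a)


# Toy data: b3 and b3p are both in group G3 with a3, but only b3 keeps the gene order
genome_a = ["a1", "a2", "a3", "a4", "a5"]
genome_b = ["b1", "b2", "b3", "b4", "b5", "b6", "b7", "b3p", "b8", "b9"]
group_of = {"a1": "G1", "b1": "G1", "a2": "G2", "b2": "G2", "a3": "G3", "b3": "G3",
            "b3p": "G3", "a4": "G4", "b4": "G4", "a5": "G5", "b5": "G5",
            "b6": "G6", "b7": "G7", "b8": "G8", "b9": "G9"}
```

With a window of 4, the pair (a3, b3) has full neighborhood support while (a3, b3p) has none, which is the kind of signal that lets a synteny step discard a paralog.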
10.
BMC Bioinformatics ; 25(1): 166, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38664639

ABSTRACT

BACKGROUND: The Biology System Description Language (BiSDL) is an accessible, easy-to-use computational language for multicellular synthetic biology. It allows synthetic biologists to represent the spatiality and multi-level cellular dynamics inherent to multicellular designs, filling a gap in the state of the art. Developed for designing and simulating spatial, multicellular synthetic biological systems, BiSDL integrates high-level conceptual design with detailed low-level modeling, fostering collaboration in the Design-Build-Test-Learn cycle. BiSDL descriptions compile directly into Nets-Within-Nets (NWNs) models, offering a unique approach to spatial and hierarchical modeling in biological systems. RESULTS: BiSDL's effectiveness is showcased through three case studies on complex multicellular systems: a bacterial consortium, a synthetic morphogen system, and a conjugative plasmid transfer process. These studies highlight BiSDL's proficiency in representing spatial interactions and multi-level cellular dynamics. The language facilitates the compilation of conceptual designs into detailed, simulatable models, leveraging the NWNs formalism. This enables intuitive modeling of complex biological systems, making advanced computational tools more accessible to a broader range of researchers. CONCLUSIONS: BiSDL represents a significant step forward in computational languages for synthetic biology, providing a sophisticated yet user-friendly tool for designing and simulating complex biological systems with an emphasis on spatiality and cellular dynamics. Its introduction has the potential to transform research and development in synthetic biology, allowing for deeper insights and novel applications in understanding and manipulating multicellular systems.


Subjects
Synthetic Biology , Synthetic Biology/methods , Models, Biological , Programming Languages , Systems Biology/methods , Software
11.
Genome Biol ; 25(1): 106, 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38664753

ABSTRACT

Centrifuger is an efficient taxonomic classification method that compares sequencing reads against a microbial genome database. In Centrifuger, the Burrows-Wheeler transformed genome sequences are losslessly compressed using a novel scheme called run-block compression. Run-block compression achieves sublinear space complexity and is effective at compressing diverse microbial databases like RefSeq while supporting fast rank queries. Combining this compression method with other strategies for compacting the Ferragina-Manzini (FM) index, Centrifuger reduces the memory footprint by half compared to other FM-index-based approaches. Furthermore, the lossless compression and the unconstrained match length help Centrifuger achieve greater accuracy than competing methods at lower taxonomic levels.


Subjects
Data Compression , Metagenomics , Data Compression/methods , Metagenomics/methods , Software , Genome, Microbial , Genome, Bacterial , Sequence Analysis, DNA/methods
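Illustrative note for entry 11: the abstract refers to Burrows-Wheeler transformed sequences, compression of runs, and rank queries on an FM index. Centrifuger's run-block compression scheme is not described in the abstract and is not reproduced here. The sketch below shows only the generic building blocks it builds on: a naive Burrows-Wheeler transform, plain run-length encoding of the result, and a naive rank query. The quadratic rotation sort is for illustration and would not scale to genomes.

```python
from typing import List, Tuple


def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def run_length_encode(s: str) -> List[Tuple[str, int]]:
    """Collapse consecutive repeats into (character, run length) pairs."""
    runs: List[Tuple[str, int]] = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rank(bwt_string: str, ch: str, i: int) -> int:
    """Number of occurrences of `ch` in the first `i` characters (linear-time version)."""
    return bwt_string[:i].count(ch)


example = bwt("banana")              # "annb$aa"
example_runs = run_length_encode(example)
```

The transform tends to group identical characters, which is why run-based compression is a natural fit for BWT strings. Practical FM indexes answer `rank` from precomputed structures rather than by scanning.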
12.
BMC Bioinformatics ; 25(1): 155, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38641616

ABSTRACT

BACKGROUND: Classification of binary data arises naturally in many clinical applications, such as patient risk stratification through ICD codes. One of the key practical challenges in data classification using machine learning is to avoid overfitting. Overfitting in supervised learning primarily occurs when a model learns random variations from noisy labels in training data rather than the underlying patterns. While traditional methods such as regularization and early stopping have demonstrated effectiveness in interpolation tasks, addressing overfitting in the classification of binary data, in which predictions always amount to extrapolation, demands extrapolation-enhanced strategies. One such approach is hybrid mechanistic/data-driven modeling, which integrates prior knowledge on input features into the learning process, enhancing the model's ability to extrapolate. RESULTS: We present NoiseCut, a Python package for noise-tolerant classification of binary data by employing a hybrid modeling approach that leverages solutions of defined max-cut problems. In a comparative analysis conducted on synthetically generated binary datasets, NoiseCut exhibits better overfitting prevention compared to the early stopping technique employed by different supervised machine learning algorithms. The noise tolerance of NoiseCut stems from a dropout strategy that leverages prior knowledge of input features and is further enhanced by the integration of max-cut problems into the learning process. CONCLUSIONS: NoiseCut is a Python package for the implementation of hybrid modeling for the classification of binary data. It facilitates the integration of mechanistic knowledge on the input features into learning from data in a structured manner and proves to be a valuable classification tool when the available training data is noisy and/or limited in size. This advantage is especially prominent in medical and biomedical applications where data scarcity and noise are common challenges. 
The codebase, illustrations, and documentation for NoiseCut are accessible for download at https://pypi.org/project/noisecut/. The implementation detailed in this paper corresponds to the version 0.2.1 release of the software.


Subjects
Algorithms , Software , Humans , Supervised Machine Learning , Machine Learning
13.
Genome Biol ; 25(1): 101, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641647

ABSTRACT

Many bioinformatics methods seek to reduce reference bias, but no methods exist to comprehensively measure it. Biastools analyzes and categorizes instances of reference bias. It works in various scenarios: when the donor's variants are known and reads are simulated; when donor variants are known and reads are real; and when variants are unknown and reads are real. Using biastools, we observe that more inclusive graph genomes result in fewer biased sites. We find that end-to-end alignment reduces bias at indels relative to local aligners. Finally, we use biastools to characterize how T2T references improve large-scale bias.


Subjects
Genome , Genomics , Genomics/methods , Computational Biology , INDEL Mutation , Bias , Sequence Analysis, DNA/methods , Software , High-Throughput Nucleotide Sequencing/methods
14.
BMC Bioinformatics ; 25(1): 159, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643080

ABSTRACT

BACKGROUND: MicroRNAs play a critical role in regulating gene expression by binding to specific target sites within gene transcripts, making the identification of microRNA targets a prominent focus of research. Conventional experimental methods for identifying microRNA targets are both time-consuming and expensive, prompting the development of computational tools for target prediction. However, the existing computational tools exhibit limited performance in meeting the demands of practical applications, highlighting the need to improve the performance of microRNA target prediction models. RESULTS: In this paper, we utilize the most popular natural language processing and computer vision technologies to propose a novel approach, called TEC-miTarget, for microRNA target prediction based on transformer encoder and convolutional neural networks. TEC-miTarget treats RNA sequences as a natural language and encodes them using a transformer encoder, a widely used encoder in natural language processing. It then combines the representations of a pair of microRNA and its candidate target site sequences into a contact map, which is a three-dimensional array similar to a multi-channel image. Therefore, the contact map's features are extracted using a four-layer convolutional neural network, enabling the prediction of interactions between microRNA and its candidate target sites. We applied a series of comparative experiments to demonstrate that TEC-miTarget significantly improves microRNA target prediction, compared with existing state-of-the-art models. Our approach is the first approach to perform comparisons with other approaches at both sequence and transcript levels. Furthermore, it is the first approach compared with both deep learning-based and seed-match-based methods. We first compared TEC-miTarget's performance with approaches at the sequence level, and our approach delivers substantial improvements in performance using the same datasets and evaluation metrics. 
Moreover, we used TEC-miTarget to predict microRNA targets in long mRNA sequences, which involves two steps: selecting candidate target site sequences and applying sequence-level predictions. Finally, we showed that TEC-miTarget outperforms other approaches at the transcript level, including the popular seed match methods widely used in previous years. CONCLUSIONS: We propose a novel approach for predicting microRNA targets at both sequence and transcript levels, and demonstrate that our approach outperforms other methods based on deep learning or seed match. We also provide our approach as easy-to-use software, TEC-miTarget, at https://github.com/tingpeng17/TEC-miTarget. Our results provide new perspectives for microRNA target prediction.


Subjects
Deep Learning , MicroRNAs , MicroRNAs/genetics , MicroRNAs/metabolism , Neural Networks, Computer , Software , RNA, Messenger/genetics
15.
Nat Commun ; 15(1): 3370, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643169

ABSTRACT

Residue-level coarse-grained (CG) molecular dynamics (MD) simulation is widely used to investigate slow biological processes that involve multiple proteins, nucleic acids, and their complexes. Biomolecules in a large simulation system are distributed non-uniformly, which limits computational efficiency with conventional methods. Here, we develop a hierarchical domain decomposition scheme with dynamic load balancing for heterogeneous biomolecular systems to maintain computational efficiency even after drastic changes in particle distribution. These schemes are applied to the dynamics of intrinsically disordered protein (IDP) droplets. During the fusion of two droplets, we find that the changes in droplet shape correlate with the mixing of IDP chains. Additionally, we simulate large systems with multiple IDP droplets, achieving simulation sizes comparable to those observed in microscopy. In our MD simulations, we directly observe Ostwald ripening, a phenomenon where small droplets dissolve and their molecules redeposit into larger droplets. These methods have been implemented in CGDYN of the GENESIS software, offering a tool for investigating mesoscopic biological processes using residue-level CG models.


Subjects
Molecular Dynamics Simulation , Nucleic Acids , Proteins , Software
16.
PLoS One ; 19(4): e0301999, 2024.
Article in English | MEDLINE | ID: mdl-38635686

ABSTRACT

To study how the nervous system processes visual information, experimenters must record neural activity while delivering visual stimuli in a controlled fashion. In animals with a nearly panoramic field of view, such as flies, precise stimulation of the entire visual field is challenging. We describe a projector-based device for stimulation of the insect visual system under a microscope. The device is based on a bowl-shaped screen that provides a wide and nearly distortion-free field of view. It is compact, cheap, easy to assemble, and easy to operate using the included open-source software for stimulus generation. We validate the virtual reality system technically and demonstrate its capabilities in a series of experiments at two levels: the cellular, by measuring the membrane potential responses of visual interneurons; and the organismal, by recording optomotor and fixation behavior of Drosophila melanogaster in tethered flight. Our experiments reveal the importance of stimulating the visual system of an insect with a wide field of view, and we provide a simple solution to do so.


Subjects
Drosophila melanogaster , Visual Fields , Animals , Drosophila melanogaster/physiology , Photic Stimulation , Software , Interneurons , Flight, Animal/physiology , Visual Perception/physiology
17.
J Phys Chem B ; 128(15): 3631-3642, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38578072

ABSTRACT

Parallel cascade selection molecular dynamics (PaCS-MD) is an enhanced conformational sampling method conducted as a "repetition of time leaps in parallel worlds", comprising cycles of multiple molecular dynamics (MD) simulations performed in parallel and selection of the initial structures of the MD runs for the next cycle. We developed PaCS-Toolkit, an optimized software utility that enables the use of different MD software and trajectory analysis tools to facilitate the execution of PaCS-MD simulations and the analysis of the obtained trajectories, including preparation for the subsequent construction of a Markov state model. PaCS-Toolkit is written in Python, is compatible with various computing environments, and allows for easy customization by editing the configuration file and specifying the MD software and analysis tools to be used. We present the software design of PaCS-Toolkit and demonstrate applications of PaCS-MD variations: original targeted PaCS-MD to peptide folding; rmsdPaCS-MD to protein domain motion; and dissociation PaCS-MD to ligand dissociation from the adenosine A2A receptor.


Subjects
Carrier Proteins , Molecular Dynamics Simulation , Protein Conformation , Software , Protein Domains
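Illustrative note for entry 17: the abstract describes PaCS-MD as repeated cycles of short MD runs performed in parallel, followed by selection of the initial structures for the next cycle. The toy sketch below mimics only that cycle structure. A one-dimensional random walk stands in for MD, distance to a target value stands in for the ranking measure, and the replicas run one after another rather than in parallel. None of the names or parameters come from PaCS-Toolkit.

```python
import random
from typing import List


def short_md(x0: float, n_steps: int, rng: random.Random, step: float = 0.1) -> List[float]:
    """Stand-in for one short MD run: a 1D random walk that records every frame."""
    trajectory = [x0]
    x = x0
    for _ in range(n_steps):
        x += rng.gauss(0.0, step)
        trajectory.append(x)
    return trajectory


def pacs_md(start: float, target: float, n_replicas: int = 5, n_steps: int = 50,
            n_cycles: int = 10, seed: int = 0) -> List[float]:
    """Run the select-and-restart cycle; return the best distance to target per cycle."""
    rng = random.Random(seed)
    initial = [start] * n_replicas
    best_per_cycle = []
    for _ in range(n_cycles):
        # 1) run one short simulation from each selected initial structure
        snapshots: List[float] = []
        for x0 in initial:
            snapshots.extend(short_md(x0, n_steps, rng))
        # 2) rank all snapshots and keep the top ones as the next starting points
        snapshots.sort(key=lambda x: abs(x - target))
        initial = snapshots[:n_replicas]
        best_per_cycle.append(abs(initial[0] - target))
    return best_per_cycle
```

Because each trajectory records its starting frame, the best distance cannot get worse from one cycle to the next in this toy version. Calling `pacs_md(0.0, 5.0)` returns one distance per cycle.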
18.
Bioinformatics ; 40(4)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38579257

ABSTRACT

MOTIVATION: Spatial transcriptomics has greatly contributed to our understanding of spatial and intra-sample heterogeneity, which could be crucial for deciphering the molecular basis of human diseases. Intra-tumor heterogeneity, for example, may be associated with cancer treatment responses. However, the lack of computational tools for exploiting cross-regional information and the limited spatial resolution of current technologies present major obstacles to elucidating tissue heterogeneity. RESULTS: To address these challenges, we introduce RegionalST, an efficient computational method that, for the first time, enables users to quantify cell type mixture and interactions, identify sub-regions of interest, and perform cross-region cell type-specific differential analysis. Our simulations and real data applications demonstrate that RegionalST is an efficient tool for visualizing and analyzing diverse spatial transcriptomics data, thereby enabling accurate and flexible exploration of tissue heterogeneity. Overall, RegionalST provides a one-stop destination for researchers seeking to delve deeper into the intricacies of spatial transcriptomics data. AVAILABILITY AND IMPLEMENTATION: The implementation of our method is available as an open-source R/Bioconductor package with a user-friendly manual available at https://bioconductor.org/packages/release/bioc/html/RegionalST.html.


Subjects
Gene Expression Profiling , Software , Humans , Gene Expression Profiling/methods
19.
J Agric Food Chem ; 72(15): 8849-8858, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38580310

ABSTRACT

Comprehensive analysis of triacylglycerol (TAG) regioisomers is extremely challenging, with many variables that can influence the results. Previously, we reported a novel algorithmic method for resolving regioisomers of complex mixtures of TAGs. In the current study, the TAG Analyzer software and its mass spectrometric fragmentation model were further developed and validated for a much wider range of TAGs. To demonstrate the method, we performed for the first time a comprehensive analysis of the TAG regioisomers of bovine milk fat, which is very important and one of the most complex TAG mixtures in nature, containing fatty acids (FAs) that range from short to long carbon chains. This analysis method forms a solid basis for further investigation of TAG regioisomer profiles in various natural fats and oils, potentially aiding in the development of new and healthier foods and nutraceuticals with targeted lipid structures.


Subjects
Milk , Tandem Mass Spectrometry , Animals , Triglycerides/chemistry , Milk/chemistry , Fats/analysis , Software
20.
Bioinformatics ; 40(4)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38588573

ABSTRACT

SUMMARY: Recent technical advancements in single-cell chromatin accessibility sequencing (scCAS) have brought new insights to the characterization of epigenetic heterogeneity. As single-cell genomics experiments scale up to hundreds of thousands of cells, the demand for computational resources for downstream analysis grows intractably large and exceeds the capabilities of most researchers. Here, we propose EpiCarousel, a tailored Python package based on lazy loading, parallel processing, and community detection for memory- and time-efficient identification of metacells, i.e., the emergence of homogenous cells, in large-scale scCAS data. Through comprehensive experiments on five datasets of various protocols, sample sizes, dimensions, numbers of cell types, and degrees of cell-type imbalance, EpiCarousel outperformed baseline methods in a systematic evaluation of memory usage, computational time, and multiple downstream analyses including cell type identification. Moreover, EpiCarousel executes preprocessing and downstream cell clustering on an atlas-level dataset with 707 043 cells and 1 154 611 peaks within 2 h while consuming <75 GB of RAM, and it characterizes cell heterogeneity better than state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION: The EpiCarousel software is well documented and freely available at https://github.com/biox-nku/epicarousel. It interoperates seamlessly with a wide range of scCAS analysis toolkits.


Subjects
Chromatin , Single-Cell Analysis , Software , Chromatin/metabolism , Single-Cell Analysis/methods , Humans , Genomics/methods , Computational Biology/methods